Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments
Hahm, Sungeun, Kim, Heejin, Lee, Gyuseong, Park, Hyunji, Lee, Jaejin
To ensure a balance between open access to justice and personal data protection, the South Korean judiciary mandates the de-identification of court judgments before they can be publicly disclosed. However, the current de-identification process is inadequate for handling court judgments at scale while adhering to strict legal requirements. Additionally, the legal definitions and categorizations of personal identifiers are vague and not well-suited for technical solutions. To tackle these challenges, we propose a de-identification framework called Thunder-DeID, which aligns with relevant laws and practices. Specifically, we (i) construct and release the first Korean legal dataset containing annotated judgments along with corresponding lists of entity mentions, (ii) introduce a systematic categorization of Personally Identifiable Information (PII), and (iii) develop an end-to-end deep neural network (DNN)-based de-identification pipeline. Our experimental results demonstrate that our model achieves state-of-the-art performance in the de-identification of court judgments.
- Europe (1.00)
- Asia > South Korea (1.00)
- North America > United States (0.93)
- Transportation > Passenger (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
Using Text Injection to Improve Recognition of Personal Identifiers in Speech
Blau, Yochai, Agrawal, Rohan, Madmony, Lior, Wang, Gary, Rosenberg, Andrew, Chen, Zhehuai, Gekhman, Zorik, Beryozkin, Genady, Haghani, Parisa, Ramabhadran, Bhuvana
Accurate recognition of specific categories, such as persons' names, dates or other identifiers is critical in many Automatic Speech Recognition (ASR) applications. As these categories represent personal information, ethical use of this data including collection, transcription, training and evaluation demands special care. One way of ensuring the security and privacy of individuals is to redact or eliminate Personally Identifiable Information (PII) from collection altogether. However, this results in ASR models that tend to have lower recognition accuracy of these categories. We use text injection to improve the recognition of PII categories by including fake textual substitutes of these categories in the training data. We demonstrate substantial improvement to Recall of Names and Dates in medical notes while improving overall WER. For alphanumeric digit sequences we show improvements to Character Error Rate and Sentence Accuracy.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
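The abstract above describes injecting fake textual substitutes for PII categories into training data. A minimal sketch of that data-preparation step, assuming hypothetical placeholder tokens (`<NAME>`, `<DATE>`) and invented substitute pools (the paper's actual pipeline and category inventory are not specified here):

```python
import random

# Hypothetical pools of fake substitutes for two PII categories.
FAKE_NAMES = ["Alice Morgan", "Ravi Patel", "Mei Chen"]
FAKE_DATES = ["March 3rd 2021", "July 14th", "the 2nd of May"]

def inject_fake_pii(template: str, rng: random.Random) -> str:
    """Fill PII placeholders in a transcript template with fake substitutes."""
    out = template.replace("<NAME>", rng.choice(FAKE_NAMES))
    out = out.replace("<DATE>", rng.choice(FAKE_DATES))
    return out

rng = random.Random(0)
templates = ["patient <NAME> was seen on <DATE>",
             "<NAME> reported symptoms starting <DATE>"]
injected = [inject_fake_pii(t, rng) for t in templates]
```

The injected text-only examples can then be mixed into training alongside real (redacted) audio-text pairs, so the model still sees name- and date-like strings without any real PII being collected.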
Protecting Sensitive Data in Analytics: A Data Engineering Perspective
Our team has shared the most effective ways to keep data safe, including key techniques such as tokenisation, suppression and cryptographic encryption. Data-driven solutions help organisations make better decisions, improve efficiency, create better experiences for customers and, ultimately, bring in more revenue. But the growth of big data is outpacing the protection of such information. With the ever-increasing amount of data being collected, stored and processed, it is essential for data engineers to understand how best to handle personal information for analytics. Data engineers frequently spend their days striking a balance between two responsibilities: harnessing large amounts of sensitive or personal data to innovate and drive change, while also adhering to strict standards that govern how that data should be handled and used.
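Two of the techniques named above, tokenisation and suppression, can be sketched in a few lines. This is a minimal illustration, not a production design: the key handling, field names and record shape are all assumptions, and a real system would use a managed key service rather than a hard-coded secret.

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # hypothetical key; real systems use a managed KMS

def tokenise(value: str) -> str:
    """Replace a direct identifier with a stable, keyed pseudonym (token)."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:12]

def suppress(record: dict, fields: list[str]) -> dict:
    """Drop fields that analytics does not need at all."""
    return {k: v for k, v in record.items() if k not in fields}

record = {"email": "jane@example.com", "ssn": "123-45-6789", "purchase": 42.5}
safe = suppress(record, ["ssn"])          # remove what analytics never needs
safe["email"] = tokenise(safe["email"])   # pseudonymise what joins still need
```

Keyed tokenisation keeps the token stable across datasets, so analysts can still join on it, while suppression simply removes fields with no analytic value; the two are typically combined rather than chosen between.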
What is Data Anonymization?
Data anonymization is the process of mitigating direct and indirect privacy risks within data, such that there is a measurable way to ensure records cannot be attributed to a specific individual or entity. With an estimated 2.5 quintillion bytes of data being generated every day and an increasing reliance on data to power new applications, machine learning models and AI technologies, implementing effective anonymization techniques and removing any bottlenecks is crucial to accelerating future developments and innovations. This post is a general introduction to anonymization, and the tools and techniques for providing sufficient privacy protections, so that personally identifiable information (PII) is safe from exposure and exploitation. Data anonymization should be considered a continuous process; one that can require rapid iteration of applying various privacy engineering techniques and then measuring those privacy outcomes until a desired end state is reached. In the following sections, we'll dive deeper into our core tenets of the data anonymization process, and then walk through how you might apply them to a notional dataset.
- Europe (0.15)
- North America > United States > California (0.05)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
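The "apply a technique, then measure the outcome, then iterate" loop described above can be sketched on a notional dataset. The sketch below uses a simple generalisation step (widening age bands) and a k-anonymity-style risk measure (the size of the rarest quasi-identifier group); the field names, the starting band width and the threshold k are all illustrative assumptions, not the post's actual methodology.

```python
from collections import Counter

def generalise_age(age: int, width: int) -> str:
    """Generalise an exact age into a band of the given width, e.g. 34 -> '30-34'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def smallest_group(rows, keys):
    """Risk measure: size of the rarest group sharing the same quasi-identifiers."""
    counts = Counter(tuple(r[k] for k in keys) for r in rows)
    return min(counts.values())

rows = [{"age": 34, "zip": "94107"}, {"age": 36, "zip": "94107"},
        {"age": 35, "zip": "94107"}, {"age": 52, "zip": "10001"},
        {"age": 53, "zip": "10001"}]

# Iterate: widen the age bands until every group has at least k = 2 members.
k, width = 2, 5
while True:
    general = [{"age": generalise_age(r["age"], width), "zip": r["zip"]}
               for r in rows]
    if smallest_group(general, ["age", "zip"]) >= k:
        break
    width *= 2
```

Each pass applies a privacy technique and then measures the outcome, stopping only once the measurable target is met, which is exactly the continuous-process framing the post argues for.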
Automation of Data De-identification - John Snow Labs
With ever more personal data being produced and stored by organizations, data privacy is becoming an increasing priority. Businesses have access to a lot of sensitive information about their customers, service providers, and employees and are required to protect that data in order to minimize the risks of scams or fraud. De-identification is used to overcome data privacy challenges and keep information safe from unauthorized parties. This post explains what de-identification is, how it works and how natural language processing (NLP) is used to automate the process of removing sensitive data from datasets. De-identification is a technique used to remove any data that could identify a person from a dataset.
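As a toy illustration of automated de-identification on free text, the sketch below redacts two PII categories with regular expressions. This is not the NLP approach the post describes (production systems typically use trained named-entity recognition models); the category names and patterns here are assumptions chosen for brevity.

```python
import re

# Minimal rule-based sketch; real pipelines use trained NER models instead.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def deidentify(text: str) -> str:
    """Replace each detected PII span with its category placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Contact John at john.doe@example.com or 555-123-4567."
clean = deidentify(note)
```

Note that the person's name survives the regex pass, which is precisely why rule-based matching alone is insufficient and NLP models are used to catch context-dependent identifiers.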
Genuine Personal Identifiers and Mutual Sureties for Sybil-Resilient Community Formation
Shahaf, Gal, Shapiro, Ehud, Talmon, Nimrod
While most of humanity is suddenly on the net, the value of this singularity is hampered by the lack of credible digital identities: Social networking, person-to-person transactions, democratic conduct, cooperation and philanthropy are all hampered by the profound presence of fake identities, as illustrated by Facebook's removal of 5.4Bn fake accounts since the beginning of 2019. Here, we introduce the fundamental notion of a "genuine personal identifier", a globally unique and singular identifier of a person, and present a foundation for a decentralized, grassroots, bottom-up process in which every human being may create, own, and protect the privacy of a genuine personal identifier. The solution employs mutual sureties among owners of personal identifiers, resulting in a mutual-surety graph reminiscent of a web-of-trust. Importantly, this approach is designed for a distributed realization, possibly using distributed ledger technology, and does not depend on the use or storage of biometric properties. For the solution to be complete, additional components are needed, notably a mechanism that encourages honest behavior and a sybil-resilient governance system.
- Asia > India (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Africa > Sierra Leone (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Services > e-Commerce Services (0.34)